Abstract

Randomized fixed-composition codes with optimal decoding error exponents are studied in [7], [8] for the
finite alphabet interference channel (IFC) with two transmitter-receiver pairs. In this paper we investigate 
the capacity region of the randomized fixed-composition coding scheme. A complete characterization 
of the capacity region of the said coding scheme is given. The inner bound is derived by showing the 
existence of a positive error exponent within the capacity region. A simple universal decoding rule is 
given. The tight outer bound is derived by extending a technique first developed in [6] for single input 
output channels to interference channels. It is shown that even with a sophisticated time-sharing scheme 
among randomized fixed-composition codes, the capacity region of the randomized fixed-composition 
coding is not bigger than the known Han-Kobayashi [15] capacity region. This suggests that the average
behavior of random codes is not sufficient to get new capacity regions.

I. Introduction 

In [15], the capacity region of the interference channel is studied for both discrete and Gaussian cases. In
this paper we study the discrete interference channels $W_{Z|XY}$ and $\tilde{W}_{\tilde{Z}|XY}$ with two pairs of encoders
and decoders, as shown in Figure 1. The two channel inputs are $x^n \in \mathcal{X}^n$ and $y^n \in \mathcal{Y}^n$, and the outputs are
$z^n \in \mathcal{Z}^n$ and $\tilde{z}^n \in \tilde{\mathcal{Z}}^n$ respectively, where $\mathcal{X}$, $\mathcal{Y}$, $\mathcal{Z}$ and $\tilde{\mathcal{Z}}$ are finite sets. We study the basic interference
channel where each encoder only has a private message for the corresponding decoder.
Fig. 1. A discrete memoryless interference channel of two users

Some recent progress on the capacity region for Gaussian interference channels is reported in [9];
however, the capacity regions for general interference channels are unknown. We focus our investigation
on the capacity region for a specific coding scheme: randomized fixed-composition codes, where the
error probability is defined as the average error over all code books with a certain composition (type).
Fixed-composition coding is a useful coding scheme in the investigation of both upper [10] and lower
bounds of channel coding error exponents [4] for point to point channels and [14], [13] for multiple
access (MAC) channels. Recently in [7] and [8], randomized fixed-composition codes are used to derive
a lower bound on the error exponent for discrete interference channels. A lower bound on the
maximum-likelihood decoding error exponent is derived; this is a new attempt at investigating the error exponents
for interference channels. The unanswered question is the capacity region of such coding schemes.

In this paper, we give a complete characterization of the interference channel capacity region for
randomized fixed-composition codes. To prove the achievability of the capacity region, we prove that
a universal decoding error exponent is positive everywhere in the interior of the capacity region. This error exponent
is derived by the method of types [3], in particular the universal decoding scheme used for multiple-access
channels [14]. A better error exponent can be achieved by using the more complicated universal decoding
rules developed in [13]. But since they both have the same achievable capacity region, we use the simpler
scheme in [14]. To prove the converse, that the achievable region matches the outer bound, we extend
the technique in [6] for point to point channels to interference channels by using the known capacity
region results for multiple-access channels. The result reveals the intimate relations between interference
channels and multiple-access channels. With the capacity region for fixed-composition codes established,
it is evident that this capacity region is a subset of the Han-Kobayashi region [15].

The technical proof of this paper is focused on the average behavior of fixed-composition code books. 
However this fundamental setup can be generalized in the following three directions. 

• It is obvious that there exists a code book whose decoding error is no bigger than the average
decoding error over all code books. Hence the achievability results in this paper guarantee the
existence of a deterministic coding scheme with at least the same error exponents and capacity
region. More discussions are in Section II-E.

• The focus of this paper is on the fixed-composition codes with a composition P, where P is 
a distribution on the input alphabet. This code book generation is different from the non-fixed- 
composition random coding [12] according to distribution P. It is well known in the literature that 
the fixed-composition code gives better error exponent result in low rate regime for point to point 
channels [4] and multiple-access channels [14], [13]. It is the same case for interference channels 
and hence the capacity region result in this paper applies to the non-fixed-composition random codes. 

• Time-sharing is a key element in achieving capacity regions for multi-terminal channels [2]. For
instance, for multiple-access channels, simple time-sharing among operational rate pairs gives the
entire capacity region. We show that our fixed-composition codes can be used to build a time-sharing
capacity region for the interference channel. More interestingly, we show that the simple time-sharing
technique that gives the entire capacity region for multiple-access channels is not enough
to get the largest capacity region; a more sophisticated time-sharing scheme is needed. Detailed
discussions are in Section IV.

The outline of the paper is as follows. In Section II we first formally define randomized fixed-composition
codes and their capacity region, and then in Section II-C we present the main result of this
paper: the interference channel capacity region for randomized fixed-composition codes in Theorem 1.
The proof is later shown in Section III with more details in the appendix. Finally in Section IV we
argue that due to the non-convexity of the randomized fixed-composition coding capacity region, a more sophisticated
time-sharing scheme is needed. This shows the necessity of studying the geometry of the code books for
interference channels.

II. Randomized fixed-composition code and its capacity region 

We first review the definition of the randomized fixed-composition code that is studied intensively in previous
works. Then the definition of the interference channel capacity region for such codes is introduced.
Then we give the main result of this paper: the complete characterization of the capacity region for
randomized fixed-composition codes.

A. Randomized fixed-composition codes 

A randomized fixed-composition code is a uniform distribution on the code books in which every 
codeword is from the type set with the fixed composition (type). 

First we introduce the notion of a type set [2]. A type set $T^n(P)$ is the set of all strings $x^n \in \mathcal{X}^n$
with the same type $P$, where $P$ is a probability distribution [2]. A sequence of type sets $T^n \subseteq \mathcal{X}^n$
has composition $P_X$ if the types of $T^n$ converge to $P_X$, i.e. $\lim_{n \to \infty} N(a|T^n)/n = P_X(a)$ for all $a \in \mathcal{X}$
such that $P_X(a) > 0$, and $N(a|T^n) = 0$ for all $a \in \mathcal{X}$ such that $P_X(a) = 0$, where $N(a|T^n)$ is the number of
occurrences of $a$ in type $T^n$. We ignore the nuisance of the integer effect and assume that $nP_X(a)$ is
an integer for all $a \in \mathcal{X}$ and that $nR_x$ and $nR_y$ are also integers. This is indeed a reasonable assumption
since we study long block lengths $n$ and all the information theoretic quantities studied in this paper
are continuous in the code compositions and rates. We simply denote by $T^n(P_X)$ the length-$n$ type set
which has "asymptotic" type $P_X$; later in the appendix we abuse the notation by simply writing $x^n \in P_X$
instead of $x^n \in T^n(P_X)$. Obviously, there are $|T^n(P_X)|^{2^{nR_x}}$ many code books with fixed composition
$P_X$ and rate $R_x$.
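The counting in the paragraph above can be checked on a small case. The following is a minimal sketch, assuming a hypothetical alphabet {a, b, c}, block length n = 8 and nR_x = 2; the function names are illustrative and not from the paper. It computes the type of a sequence, the size of its type set as a multinomial coefficient, and the resulting number of fixed-composition code books.

```python
from collections import Counter
from math import factorial, log2

def type_counts(seq):
    """Return N(a | seq), the number of occurrences of each symbol a."""
    return Counter(seq)

def type_set_size(counts):
    """|T^n(P)| = n! / prod_a (n P(a))!, a multinomial coefficient."""
    n = sum(counts.values())
    size = factorial(n)
    for c in counts.values():
        size //= factorial(c)  # exact at every step for a multinomial
    return size

def entropy_bits(counts):
    """Entropy (in bits) of the type defined by the counts."""
    n = sum(counts.values())
    return -sum(c / n * log2(c / n) for c in counts.values() if c > 0)

# Hypothetical example: n = 8 with composition P = (1/2, 1/4, 1/4).
x = "aaaabbcc"
counts = type_counts(x)
size = type_set_size(counts)            # 8! / (4! 2! 2!)
num_messages = 2 ** 2                   # 2^{n R_x} with n R_x = 2
num_codebooks = size ** num_messages    # |T^n(P_X)|^(2^{n R_x})
```

Here `size` is 420, which respects the standard bound |T^n(P)| <= 2^{nH(P)} = 2^{12}.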

In this paper, we study the randomized fixed-composition codes, where each code book with all
codewords from the fixed composition is chosen with the same probability. Equivalently, over all
these code books, the code word for message $i$ is i.i.d. uniformly distributed on the type set $T^n(P_X)$. A
formal definition is as follows.

Definition 1: Randomized fixed-composition codes: for a probability distribution $P_X$ on $\mathcal{X}$, a rate-$R_x$
randomized fixed-composition-$P_X$ encoder picks a code book with the following probability: for
any fixed-composition-$P_X$ code book $\theta^n = (\theta^n(1), \theta^n(2), \ldots, \theta^n(2^{nR_x}))$, where $\theta^n(i) \in T^n(P_X)$, $i =
1, 2, \ldots, 2^{nR_x}$, and $\theta^n(i)$ and $\theta^n(j)$ may not be different for $i \neq j$, the code book $\theta^n$ is chosen, i.e.
$x^n(i) = \theta^n(i)$, $i = 1, 2, \ldots, 2^{nR_x}$, with probability

$$\left(\frac{1}{|T^n(P_X)|}\right)^{2^{nR_x}}.$$

In other words, the choice of the code book is a random variable $c_X$ uniformly distributed on the index
set of all the possible code books with fixed composition $P_X$: $\{1, 2, 3, \ldots, |T^n(P_X)|^{2^{nR_x}}\}$, while $c_X$ is
shared between the encoder X and the decoders X and Y.

The key property of the randomized fixed-composition code is that for any message subset $\{i_1, i_2, \ldots, i_l\} \subseteq
\{1, 2, \ldots, 2^{nR_x}\}$, the code words for these messages are independent and identically distributed on the type
set $T^n(P_X)$.
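The i.i.d. property above suggests a direct way to simulate the randomized encoder. The following is a minimal sketch, assuming a hypothetical composition and a small number of messages; `sample_codebook` is an illustrative name. Each codeword is a uniformly random permutation of a fixed base sequence with the required symbol counts, which is uniform on the type set, and codewords are drawn independently, so repeated codewords are possible.

```python
import random
from collections import Counter

def sample_codebook(composition_counts, num_messages, rng):
    """Draw each codeword independently and uniformly from the type set.

    composition_counts maps each symbol a to n * P_X(a), assumed integer.
    """
    base = [a for a, c in sorted(composition_counts.items()) for _ in range(c)]
    codebook = []
    for _ in range(num_messages):
        word = base[:]
        rng.shuffle(word)              # uniform permutation of the base word
        codebook.append(tuple(word))
    return codebook

rng = random.Random(0)                 # the seed is illustrative only
composition = {"a": 4, "b": 2, "c": 2}
codebook = sample_codebook(composition, num_messages=4, rng=rng)
```

Every codeword returned has exactly the prescribed composition, whatever the seed.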

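For randomized fixed-composition codes, the average error probability $P^n_{e(x)}(R_x, R_y, P_X, P_Y)$ for X
is the expectation of the decoding error over all messages, code books and channel behaviors:

$$P^n_{e(x)}(R_x, R_y, P_X, P_Y) = \left(\frac{1}{|T^n(P_X)|}\right)^{2^{nR_x}} \left(\frac{1}{|T^n(P_Y)|}\right)^{2^{nR_y}} \sum_{c_X} \sum_{c_Y} \frac{1}{2^{nR_x}} \sum_{m_x} \frac{1}{2^{nR_y}} \sum_{m_y} \sum_{z^n} W_{Z|XY}(z^n | x^n(m_x), y^n(m_y)) \, 1(\hat{m}_x(z^n) \neq m_x) \tag{1}$$

where $x^n(m_x)$ is the code word of message $m_x$ in code book $c_X$, similarly for $y^n(m_y)$, and $\hat{m}_x(z^n)$ is
the decision made by the decoder knowing the code books $c_X$ and $c_Y$.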
Fig. 2. Randomized fixed-composition capacity region $\mathcal{R}_x(P_X, P_Y)$ for X; the achievable region is the union of Regions I
and II.

B. Randomized fixed-composition coding capacity for interference channels 

Given the definitions of randomized fixed-composition coding and the average error probability in (1)
for such codes, we can formally define the capacity region for such codes.

Definition 2: Capacity region for randomized fixed-composition codes: for fixed compositions $P_X$
and $P_Y$, a rate pair $(R_x, R_y)$ is said to be achievable for X if for all $\delta > 0$ there exists $N_\delta < \infty$ such that
for all $n > N_\delta$,

$$P^n_{e(x)}(R_x, R_y, P_X, P_Y) < \delta. \tag{2}$$

We denote by $\mathcal{R}_x(P_X, P_Y)$ the closure of the union of all achievable rate pairs. Similarly we denote
by $\mathcal{R}_y(P_X, P_Y)$ the achievable region for Y, and by $\mathcal{R}_{xy}(P_X, P_Y)$ the region for (X, Y) where both decoding errors
are small. Obviously

$$\mathcal{R}_{xy}(P_X, P_Y) = \mathcal{R}_x(P_X, P_Y) \cap \mathcal{R}_y(P_X, P_Y). \tag{3}$$

We only need to focus our investigation on $\mathcal{R}_x(P_X, P_Y)$; then by the obvious symmetry, both $\mathcal{R}_y(P_X, P_Y)$
and $\mathcal{R}_{xy}(P_X, P_Y)$ follow.

C. Capacity region of the fixed-composition code, $\mathcal{R}_x(P_X, P_Y)$, for X

The main result of this paper is the complete characterization of the randomized fixed-composition
capacity region $\mathcal{R}_x(P_X, P_Y)$ for X, as given in (4); by symmetry, $\mathcal{R}_{xy}(P_X, P_Y)$ follows.

Theorem 1: Interference channel capacity region $\mathcal{R}_x(P_X, P_Y)$ for randomized fixed-composition codes
with compositions $P_X$ and $P_Y$:

$$\mathcal{R}_x(P_X, P_Y) = \{(R_x, R_y): 0 \le R_x < I(X;Z),\ 0 \le R_y\} \cup \{(R_x, R_y): 0 \le R_x < I(X;Z|Y),\ R_x + R_y < I(X,Y;Z)\} \tag{4}$$

where the random variables in (4) are $(X, Y, Z) \sim P_X \times P_Y \times W_{Z|XY}$. The region $\mathcal{R}_x(P_X, P_Y)$ is illustrated in
Figure 2.

Fig. 3. A typical randomized fixed-composition capacity region $\mathcal{R}_{xy}(P_X, P_Y) = \mathcal{R}_x(P_X, P_Y) \cap \mathcal{R}_y(P_X, P_Y)$ is the intersection
of the dotted line and the solid lines; this capacity region is not necessarily convex.

The achievability part of the theorem states that for a rate pair $(R_x, R_y) \in \mathcal{R}_x(P_X, P_Y)$, the union of
Regions I and II in Figure 2, for all $\delta > 0$ there exists $N_\delta < \infty$ such that for all $n > N_\delta$, the average error
probability (1) for the randomized code from compositions $P_X$ and $P_Y$ is smaller than $\delta$ for X:

$$P^n_{e(x)}(R_x, R_y, P_X, P_Y) < \delta$$

for some decoding rule. Region II is also the multiple-access capacity region for fixed-composition codes
$(P_X, P_Y)$ for channel $W_{Z|XY}$.

The converse of the theorem states that for any rate pair $(R_x, R_y)$ outside of $\mathcal{R}_x(P_X, P_Y)$, that is, in
Regions III, IV and V in Figure 2, there exists $\delta > 0$ such that for all $n$,

$$P^n_{e(x)}(R_x, R_y, P_X, P_Y) > \delta$$

no matter what decoding rule is used. Note that the error probability $P^n_{e(x)}(R_x, R_y, P_X, P_Y)$ is
defined in (1).

The proof of Theorem 1 is in Section III.
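To make the region in Theorem 1 concrete, the following is a minimal sketch that evaluates the three mutual informations for (X, Y, Z) distributed as P_X x P_Y x W and tests whether a rate pair satisfies the inequalities of Region I or Region II as written in (4). The function names, the dictionary representation of the channel and the noiseless XOR example are assumptions made for illustration only.

```python
from itertools import product
from math import log2

def mutual_informations(px, py, w):
    """Return (I(X;Z), I(X;Z|Y), I(X,Y;Z)) in bits for (X,Y,Z) ~ P_X x P_Y x W.

    px, py: dicts symbol -> probability; w[(x, y)]: dict z -> W(z|x,y).
    """
    p = {(x, y, z): px[x] * py[y] * wz
         for x, y in product(px, py) for z, wz in w[(x, y)].items()}
    pz, pxz, pyz = {}, {}, {}
    for (x, y, z), v in p.items():
        pz[z] = pz.get(z, 0.0) + v
        pxz[(x, z)] = pxz.get((x, z), 0.0) + v
        pyz[(y, z)] = pyz.get((y, z), 0.0) + v
    i_xz = sum(v * log2(v / (px[x] * pz[z]))
               for (x, z), v in pxz.items() if v > 0)
    i_xz_y = i_xyz = 0.0
    for (x, y, z), v in p.items():
        if v > 0:
            wz = w[(x, y)][z]
            i_xz_y += v * log2(wz * py[y] / pyz[(y, z)])  # W(z|x,y) / P(z|y)
            i_xyz += v * log2(wz / pz[z])                 # W(z|x,y) / P(z)
    return i_xz, i_xz_y, i_xyz

def in_region_x(rx, ry, px, py, w):
    """Check the inequalities of Region I or Region II in (4)."""
    i_xz, i_xz_y, i_xyz = mutual_informations(px, py, w)
    region_one = 0 <= rx < i_xz and ry >= 0
    region_two = 0 <= rx < i_xz_y and ry >= 0 and rx + ry < i_xyz
    return region_one or region_two

# Hypothetical noiseless channel Z = X XOR Y with uniform compositions.
px = py = {0: 0.5, 1: 0.5}
w = {(x, y): {x ^ y: 1.0} for x, y in product((0, 1), repeat=2)}
i_xz, i_xz_y, i_xyz = mutual_informations(px, py, w)
```

In this example I(X;Z) = 0 while I(X;Z|Y) = I(X,Y;Z) = 1 bit, so Region I is empty and the rate pair must satisfy the sum-rate constraint of Region II.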

D. Necessities of more sophisticated time-sharing schemes 

In the achievability part of Theorem 1 we prove that the average error probability for X is arbitrarily
small for a randomized fixed-composition code if the rate pair $(R_x, R_y)$ is inside the capacity region
$\mathcal{R}_x(P_X, P_Y)$. For interference channels, it is obvious that the rate region for both X and Y is:

$$\mathcal{R}_{xy}(P_X, P_Y) = \mathcal{R}_x(P_X, P_Y) \cap \mathcal{R}_y(P_X, P_Y), \tag{5}$$

where $\mathcal{R}_y(P_X, P_Y)$ is defined in the same manner as $\mathcal{R}_x(P_X, P_Y)$ but the channel is $\tilde{W}_{\tilde{Z}|XY}$ instead
of $W_{Z|XY}$, as shown in Figure 1. A typical capacity region $\mathcal{R}_{xy}(P_X, P_Y)$ is shown in Figure 3. It is not
necessarily convex.
However, by a simple time-sharing between different rate pairs for the same composition, we can
convexify the capacity region. Then the convex hull of the union of all such capacity regions of different
compositions gives a bigger convex achievable capacity region. This capacity region of the interference
channel is

$$\mathrm{CONVEX}\left(\bigcup_{P_X, P_Y} \mathcal{R}_{xy}(P_X, P_Y)\right).$$

It is tempting to claim that the above convex capacity region is the largest one can get by time-sharing
the "basic" fixed-composition codes, as is the case for multiple-access channels shown in [2]. However, as will
be discussed later in Section IV, it is not the case. A more sophisticated time-sharing gives a bigger
capacity region.

This is an important difference between interference channel coding and multiple-access channel coding 
because the fixed-composition capacity region is convex for the latter and hence the simple time-sharing 
gives the biggest capacity region [2]. Time-sharing capacity is detailed in Section IV.

E. Existence of a good code for an interference channel 

In this paper we focus our study on the average (over all messages) error probability over all code
books with the same composition. For a rate pair $(R_x, R_y)$, if the average error probability for X is
smaller than $\delta$, then obviously there exists a code book such that the error probability is smaller than
$\delta$ for X. This should be clear from the definition of the error probability $P^n_{e(x)}(R_x, R_y, P_X, P_Y)$ in (1). In
the following example, we illustrate that this is also the case for the decoding error for both X and Y. We
claim without proof that this is also true for the "uniform" time-sharing coding schemes discussed later in
Section IV. The existence of a code book that achieves the error exponents in the achievability part of
the proof of Theorem 1 can also be shown. The proof is similar to that in [12] and Exercise 30 (b) on
page 198 of [5].

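Similar to the error probability for X defined in (1), we define the average joint error probability for
X and Y as

$$P^n_{e(xy)}(R_x, R_y, P_X, P_Y) = \left(\frac{1}{|T^n(P_X)|}\right)^{2^{nR_x}} \left(\frac{1}{|T^n(P_Y)|}\right)^{2^{nR_y}} \sum_{c_X} \sum_{c_Y} \frac{1}{2^{nR_x}} \sum_{m_x} \frac{1}{2^{nR_y}} \sum_{m_y} \Big\{ \sum_{z^n} W_{Z|XY}(z^n | x^n(m_x), y^n(m_y)) \, 1(\hat{m}_x(z^n) \neq m_x) + \sum_{\tilde{z}^n} \tilde{W}_{\tilde{Z}|XY}(\tilde{z}^n | x^n(m_x), y^n(m_y)) \, 1(\hat{m}_y(\tilde{z}^n) \neq m_y) \Big\}. \tag{6}$$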

For a rate pair $(R_x, R_y) \in \mathcal{R}_{xy}(P_X, P_Y) = \mathcal{R}_x(P_X, P_Y) \cap \mathcal{R}_y(P_X, P_Y)$, we know that for all $\delta > 0$
there exists $N_\delta < \infty$ such that for all $n > N_\delta$, the average error probability is smaller than $\delta$ for user X and
user Y:
$P^n_{e(x)}(R_x, R_y, P_X, P_Y) < \delta$ and $P^n_{e(y)}(R_x, R_y, P_X, P_Y) < \delta$. It is easy to see that the average joint error
probability for users X and Y can be bounded by:

$$P^n_{e(xy)}(R_x, R_y, P_X, P_Y) = P^n_{e(x)}(R_x, R_y, P_X, P_Y) + P^n_{e(y)}(R_x, R_y, P_X, P_Y) < 2\delta. \tag{7}$$

From (6), we know that $P^n_{e(xy)}(R_x, R_y, P_X, P_Y)$ is the average error probability of all $(P_X, P_Y)$-fixed-composition
codes. Together with (7), we know that there exists at least one code book such that the
error probability is no bigger than $2\delta$.
Note that the converse for randomized coding does not guarantee that there is not a single good fixed-composition
code book. The converse claims that the average (over all code books with the composition)
decoding error probability does not converge to zero if the rate pair is outside the capacity region in
Theorem 1.

III. Proof of Theorem 1

There are two parts of the theorem, achievability and converse. The achievability part is proved by 
applying the classical method of types in point to point channel coding and MAC channel coding for 
randomized fixed-composition code. The converse is proved by extending the technique first developed 
in [6] for point to point channels to interference channels. 

A. Achievability 

We show that in the interior of the capacity region, i.e. the union of Regions I and II in Figure 2,
a positive error exponent is achieved by applying the randomized fixed-composition coding defined in
Definition 1. In Sections III-A.1 and III-A.2 we describe the universal decoding rules for Regions II and
I respectively. We then present the error exponent results in Lemma 1 in Section III-A.3 and Lemma 2
in Section III-A.4, which cover Regions II and I respectively. Then in Lemma 3 in Section III-A.5 we
show that these error exponents are positive in the interior of the capacity region $\mathcal{R}_x(P_X, P_Y)$ and hence
conclude the proof of the achievability part of Theorem 1.

1) Decoding rule in Region II: In Region II, we show that decoder X can decode both messages $m_x$
and $m_y$ with small error probabilities. This is essentially a multiple-access channel coding problem. We
use the technique developed in [5] to derive positive error exponents that parallel those in [14].
The decoder is a simple maximum mutual information$^1$ decoder [5]. This decoding rule is universal in
the sense that the decoder does not need to know the multiple-access channel $W_{Z|XY}$. We describe the
decoding rule here: the estimate of the joint message is the message pair such that the input to the channel
$W_{Z|XY}$ and the output of the channel have the maximal empirical mutual information, i.e.:

$$(\hat{m}_x(z^n), \hat{m}_y(z^n)) = \arg\max_{i \in \{1, 2, \ldots, 2^{nR_x}\},\ j \in \{1, 2, \ldots, 2^{nR_y}\}} I(z^n; x^n(i), y^n(j)) \tag{8}$$

where $z^n$ is the channel output and $x^n(i)$ and $y^n(j)$ are the channel inputs for messages $i$ and $j$ respectively.
$I(z^n; x^n, y^n)$ is the empirical mutual information between $z^n$ and $(x^n, y^n)$; the point to point maximal
mutual information decoding is studied in [5].

If there is a tie, the decoder can choose an arbitrary winner or simply declare error. In Lemma 1
we show that by using the randomized fixed-composition encoding and the maximal mutual information
decoding, a non-negative error exponent is achieved in Region II.
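The decoding rule (8) can be sketched in a few lines. The following is a minimal illustration, assuming a hypothetical noiseless adder channel z = x + y, block length 6 and two codewords per user; the function names are not from the paper. Ties are resolved here in favor of the first pair found, which is one of the arbitrary choices the text allows.

```python
from collections import Counter
from math import log2

def empirical_mi(u, v):
    """Empirical mutual information (bits) between equal-length sequences."""
    n = len(u)
    cu, cv, cuv = Counter(u), Counter(v), Counter(zip(u, v))
    return sum(c / n * log2(c * n / (cu[a] * cv[b]))
               for (a, b), c in cuv.items())

def mmi_decode_pair(z, codebook_x, codebook_y):
    """Return the (i, j) maximizing I(z^n; x^n(i), y^n(j)) as in (8)."""
    best, best_val = None, float("-inf")
    for i, xw in enumerate(codebook_x):
        for j, yw in enumerate(codebook_y):
            val = empirical_mi(z, list(zip(xw, yw)))
            if val > best_val:          # first maximizer wins a tie
                best, best_val = (i, j), val
    return best

# Hypothetical fixed-composition code books (three 0s and three 1s each).
codebook_x = ["000111", "010101"]
codebook_y = ["001011", "100110"]
# Message pair (0, 0) sent over the assumed noiseless adder channel.
z = [int(a) + int(b) for a, b in zip(codebook_x[0], codebook_y[0])]
decoded = mmi_decode_pair(z, codebook_x, codebook_y)
```

Note that the decoder never uses the channel law, only joint empirical counts, which is the sense in which the rule is universal.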

2) Decoding rule in Region I: In Region I, decoder X only estimates $m_x$ by treating the input of
encoder Y as a source of random noise. This is essentially a point to point channel coding problem.
The channel itself has memory since the input of encoder Y is not memoryless. Similar to the multiple-access
channel coding problem studied in Region II, we use a maximal mutual information decoding
rule:

$$\hat{m}_x(z^n) = \arg\max_{i \in \{1, 2, \ldots, 2^{nR_x}\}} I(z^n; x^n(i)) \tag{9}$$

In Lemma 2 we show that by using the randomized fixed-composition encoding and the maximal mutual
information decoding, a non-negative error exponent is achieved in Region I.

$^1$ A more sophisticated decoding rule based on minimum conditional entropy decoding for multiple-access channels is developed
in [13]; it is shown that this decoding rule achieves a bigger error exponent in the low rate regime. The goal of this paper is, however,
not to derive the tightest lower bound on the error exponent. We only need a coding scheme that achieves a positive error exponent
in the capacity region in Theorem 1. Hence we use the simpler decoding rule here.

3) Lower bound on the error exponent in Region II:

Lemma 1: (Region II) Multiple-access channel error exponents (joint error probability). For the randomized
coding scheme described in Definition 1 and the decoding rule described in (8), the decoding
error probability averaged over all messages, code books and channel behaviors is upper bounded by an
exponential term:

$$\Pr((m_x, m_y) \neq (\hat{m}_x, \hat{m}_y)) = \left(\frac{1}{|T^n(P_X)|}\right)^{2^{nR_x}} \left(\frac{1}{|T^n(P_Y)|}\right)^{2^{nR_y}} \sum_{c_X} \sum_{c_Y} \frac{1}{2^{nR_x}} \sum_{m_x} \frac{1}{2^{nR_y}} \sum_{m_y} \sum_{z^n} W_{Z|XY}(z^n | x^n(m_x), y^n(m_y)) \, 1\big((\hat{m}_x(z^n), \hat{m}_y(z^n)) \neq (m_x, m_y)\big) \tag{10}$$

$$\le 2^{-n(E - \epsilon_n)}, \tag{11}$$

where $\epsilon_n$ converges to zero as $n$ goes to infinity, and $E = \min\{E_{xy}, E_{x|y}, E_{y|x}\}$, where

$$E_{xy} = \min_{Q_{XYZ}: Q_X = P_X, Q_Y = P_Y} D(Q_{Z|XY} \| W | Q_{XY}) + D(Q_{XY} \| P_X \times P_Y) + |I_Q(X, Y; Z) - R_x - R_y|^+,$$

$$E_{x|y} = \min_{Q_{XYZ}: Q_X = P_X, Q_Y = P_Y} D(Q_{Z|XY} \| W | Q_{XY}) + D(Q_{XY} \| P_X \times P_Y) + |I_Q(X; Z|Y) - R_x|^+,$$

$$E_{y|x} = \min_{Q_{XYZ}: Q_X = P_X, Q_Y = P_Y} D(Q_{Z|XY} \| W | Q_{XY}) + D(Q_{XY} \| P_X \times P_Y) + |I_Q(Y; Z|X) - R_y|^+,$$

where $|t|^+ = \max\{0, t\}$ and the random variables $(X, Y, Z) \sim Q_{XYZ}$ in $I_Q(X; Z|Y)$, $I_Q(Y; Z|X)$ and
$I_Q(X, Y; Z)$.

Remark 1: It is easy to verify that $D(Q_{Z|XY} \| W | Q_{XY}) + D(Q_{XY} \| P_X \times P_Y) = D(Q_{XYZ} \| P_X \times P_Y \times
W)$, so the expressions for the error exponents can be further simplified. We use the expressions similar
to those in [14] because they are more intuitive.
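The identity in Remark 1 is the chain rule for the Kullback-Leibler divergence, and it can be checked numerically. The following is a minimal sketch on binary alphabets; the distributions `q_xyz`, `p_x`, `p_y` and the channel `w` are arbitrary values assumed for illustration, not taken from the paper.

```python
from itertools import product
from math import log2
import random

def kl(p, q):
    """D(p || q) in bits for dicts on the same keys; needs q > 0 where p > 0."""
    return sum(v * log2(v / q[k]) for k, v in p.items() if v > 0)

rng = random.Random(1)                  # the seed is illustrative only
xs = ys = zs = (0, 1)
keys = list(product(xs, ys, zs))

# Hypothetical joint distribution Q_XYZ with full support.
raw = [rng.random() + 0.1 for _ in keys]
total = sum(raw)
q_xyz = {k: r / total for k, r in zip(keys, raw)}

# Hypothetical compositions and channel W(z | x, y).
p_x = {0: 0.3, 1: 0.7}
p_y = {0: 0.6, 1: 0.4}
w = {(x, y): {0: 0.2 + 0.5 * (x ^ y), 1: 0.8 - 0.5 * (x ^ y)}
     for x, y in product(xs, ys)}

q_xy = {(x, y): sum(q_xyz[(x, y, z)] for z in zs) for x, y in product(xs, ys)}

# D(Q_{Z|XY} || W | Q_XY): divergence of the conditionals, averaged over Q_XY.
d_cond = sum(q_xyz[(x, y, z)]
             * log2((q_xyz[(x, y, z)] / q_xy[(x, y)]) / w[(x, y)][z])
             for x, y, z in keys)
d_inputs = kl(q_xy, {(x, y): p_x[x] * p_y[y] for x, y in product(xs, ys)})
d_joint = kl(q_xyz, {(x, y, z): p_x[x] * p_y[y] * w[(x, y)][z]
                     for x, y, z in keys})
```

The two sides `d_cond + d_inputs` and `d_joint` agree up to floating point error, and the check does not rely on the marginal constraints $Q_X = P_X$, $Q_Y = P_Y$.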

Remark 2: The proof parallels that in [14], which is in turn an extension of the point to point channel
coding problem studied in [5]. The method of types is the main tool for the proofs. The difference is
that we need to upper bound the average error probability instead of showing the existence
of a good code book as in [14]. Without giving details, we follow Gallager's proof in [12] and claim the
existence of a good code with the same error exponent as that in [14] as a simple corollary of Lemma 1.

Proof: First we have an obvious upper bound on the error probability

$$\Pr((m_x, m_y) \neq (\hat{m}_x, \hat{m}_y)) = \Pr(m_x \neq \hat{m}_x, m_y \neq \hat{m}_y) + \Pr(m_x \neq \hat{m}_x, m_y = \hat{m}_y) + \Pr(m_x = \hat{m}_x, m_y \neq \hat{m}_y)$$
$$\le \Pr(m_x \neq \hat{m}_x, m_y \neq \hat{m}_y) + \Pr(m_x \neq \hat{m}_x | m_y = \hat{m}_y) + \Pr(m_y \neq \hat{m}_y | m_x = \hat{m}_x). \tag{12}$$

The inequality in (12) follows from the identity $P(A, B) = P(A|B)P(B) \le P(A|B)$. Now we upper bound
each individual error probability in (12) by exponentials of $n$. We only need to show that

$$\Pr(m_x \neq \hat{m}_x, m_y \neq \hat{m}_y) \le 2^{-n(E_{xy} - \epsilon_n)}, \tag{13}$$
$$\Pr(m_x \neq \hat{m}_x | m_y = \hat{m}_y) \le 2^{-n(E_{x|y} - \epsilon_n)}, \tag{14}$$
$$\text{and } \Pr(m_y \neq \hat{m}_y | m_x = \hat{m}_x) \le 2^{-n(E_{y|x} - \epsilon_n)}. \tag{15}$$

We prove (13) and (14); (15) follows from (14) by symmetry. The proofs are in Appendix A, where a standard
method of types argument is used. □

4) Lower bound on the error exponent in Region I:

Lemma 2: (Region I) Point to point channel coding error exponent (decoding X only). For the
randomized coding scheme described in Definition 1 and the decoding rule described in (9), the decoding
error probability averaged over all messages, code books and channel behaviors is upper bounded by an
exponential term:

$$\Pr(m_x \neq \hat{m}_x) = \left(\frac{1}{|T^n(P_X)|}\right)^{2^{nR_x}} \left(\frac{1}{|T^n(P_Y)|}\right)^{2^{nR_y}} \sum_{c_X} \sum_{c_Y} \frac{1}{2^{nR_x}} \sum_{m_x} \frac{1}{2^{nR_y}} \sum_{m_y} \sum_{z^n} W_{Z|XY}(z^n | x^n(m_x), y^n(m_y)) \, 1(\hat{m}_x(z^n) \neq m_x)$$

$$\le 2^{-n(E_x - \epsilon_n)}, \tag{16}$$

where $\epsilon_n$ converges to zero as $n$ goes to infinity, and

$$E_x = \min_{Q_{XYZ}: Q_X = P_X, Q_Y = P_Y} D(Q_{Z|XY} \| W | Q_{XY}) + D(Q_{XY} \| P_X \times P_Y) + |I_Q(X; Z) - R_x|^+.$$

Proof: We give a unified proof for (13), (14) and (16) in Appendix A. □

With Lemma 1 and Lemma 2 we know that some non-negative error exponents can be achieved for the
randomized $(P_X, P_Y)$ fixed-composition code if the rate pair $(R_x, R_y) \in \mathcal{R}_x(P_X, P_Y)$. This is because
both the Kullback-Leibler divergence and $|\cdot|^+$ are always non-negative. Now we only need to show the
positiveness of those error exponents when the rate pair is in the interior of $\mathcal{R}_x(P_X, P_Y)$.

5) Positiveness of the error exponents: 

Lemma 3: For rate pairs $(R_x, R_y)$ in the interior of $\mathcal{R}_x(P_X, P_Y)$ defined in Theorem 1,

$$\max\{\min\{E_{xy}, E_{x|y}, E_{y|x}\}, E_x\} > 0.$$

More specifically, we show two things. First, if $R_x < I(X; Z)$, where $(X, Y, Z) \sim P_X \times P_Y \times W_{Z|XY}$,
then $E_x > 0$. This covers Region I. Secondly, if $R_x < I(X; Z|Y)$, $R_y < I(Y; Z|X)$ and $R_x + R_y <
I(X, Y; Z)$, where $(X, Y, Z) \sim P_X \times P_Y \times W_{Z|XY}$, then $\min\{E_{xy}, E_{x|y}, E_{y|x}\} > 0$; this covers Region
II.



Proof: First, suppose that for some $R_x < I(X; Z)$, $E_x \le 0$. Since both the Kullback-Leibler divergence
and $|\cdot|^+$ are non-negative functions, we must have $E_x = 0$ and hence there exists a distribution $Q_{XYZ}$
s.t. $Q_X = P_X$, $Q_Y = P_Y$ and all the individual non-negative functions are zero:

$$D(Q_{XY} \| P_X \times P_Y) = 0,$$
$$D(Q_{Z|XY} \| W | Q_{XY}) = 0,$$
$$|I_Q(X; Z) - R_x|^+ = 0.$$

The first equation tells us that $Q_{XY} = P_X \times P_Y$. Then the second equation becomes $D(Q_{Z|XY} \| W | P_X \times
P_Y) = 0$, which means that $Q_{Z|XY} \times P_X \times P_Y = W \times P_X \times P_Y$, so $I_Q(X; Z) = I(X; Z)$, where
the random variables $(X, Y, Z) \sim P_X \times P_Y \times W_{Z|XY}$ in $I(X; Z)$. Now the third equation becomes
$|I(X; Z) - R_x|^+ = 0$, which is equivalent to $I(X; Z) \le R_x$; this contradicts the fact that
$R_x < I(X; Z)$.
Secondly, suppose that for some rate pair $(R_x, R_y)$ in Region II, i.e. $R_x < I(X; Z|Y)$, $R_y <
I(Y; Z|X)$ and $R_x + R_y < I(X, Y; Z)$, we have $\min\{E_{xy}, E_{x|y}, E_{y|x}\} \le 0$; then $E_{xy} = 0$ or $E_{x|y} = 0$
or $E_{y|x} = 0$. Following exactly the same argument as that in the first part of the proof of Lemma 3, we
can get contradictions with the fact that the rate pair $(R_x, R_y)$ is in the interior of Region II. □

From the above three lemmas, we conclude that the error probability for decoding message X is upper
bounded by $2^{-n(E - \epsilon_n)}$ for all $(R_x, R_y)$ in the interior of $\mathcal{R}_x(P_X, P_Y)$, where $E > 0$ and $\lim_{n \to \infty} \epsilon_n = 0$. Hence the error
probability converges to zero exponentially fast for large $n$. This concludes the achievability part of the
proof of Theorem 1.

B. Converse 

We show that the average decoding error of decoder X does not converge to zero with increasing $n$
if the rate pair $(R_x, R_y)$ is outside the capacity region $\mathcal{R}_x(P_X, P_Y)$ shown in Figure 2. There are three
parts of the proof, for Regions V, IV and III respectively.

1) Region V: First, we show that in Region V the average error probability does not converge to zero
as the block length goes to infinity. This is proved by using a modified version of the reliability function for
rates higher than the channel capacity [6].

Lemma 4: Region V, the average error probability for X does not converge to 0 with block length $n$
if $R_x > I(X; Z|Y)$, where $(X, Y, Z) \sim P_X \times P_Y \times W_{Z|XY}$.

Proof: It is enough to show the case where there is only one message for Y and encoder Y sends
a code word $y^n$ with composition $P_Y$. The code book for encoder X is still uniformly generated among
all the fixed-composition-$P_X$ code books. In the rest of the proof, we investigate the typical behavior of
the codewords $x^n$ and modify Lemma 3 and Lemma 5 from [6] to show that

$$\Pr(m_x \neq \hat{m}_x) = P^n_{e(x)}(R_x, R_y, P_X, P_Y) > \frac{1}{2} \tag{17}$$

for large $n$. The details of the proof are in Appendix B. □

2) Region IV: The more complicated case is Region IV. We show that the decoding error
probability for user X does not converge to zero with block length $n$. The proof is by contradiction.
The idea is to construct a decoder that decodes both message $m_x$ and message $m_y$ correctly with high
probability if the decoding error for $m_x$ converges to zero. Then, again by using a modified version of the proof
of the reliability function for rates higher than the channel capacity in [6], we get a contradiction.

Lemma 5: Region IV, the average error probability for X does not converge to 0 with block length $n$
if $R_x < I(X; Z|Y)$, $R_y < I(Y; Z|X)$ and $R_x + R_y > I(X, Y; Z)$, where $(X, Y, Z) \sim P_X \times P_Y \times W_{Z|XY}$.

Proof: Suppose that

$$\Pr(m_x \neq \hat{m}_x) = P^n_{e(x)}(R_x, R_y, P_X, P_Y) \le \delta_n \tag{18}$$

where $\delta_n$ goes to zero with $n$. Let decoder X decode $m_y$ by the same decoding rule devised in (8):

$$\hat{m}_y(z^n) = \arg\max_{j \in \{1, 2, \ldots, 2^{nR_y}\}} I(z^n; x^n(\hat{m}_x(z^n)), y^n(j)). \tag{19}$$

The decoding error for either message at decoder X is now:

$$\Pr((\hat{m}_x, \hat{m}_y) \neq (m_x, m_y)) = \Pr(\hat{m}_x \neq m_x) + \Pr(\hat{m}_x = m_x, \hat{m}_y \neq m_y)$$
$$\le \Pr(\hat{m}_x \neq m_x) + \Pr(\hat{m}_y \neq m_y | \hat{m}_x = m_x). \tag{20}$$
Given $\hat{m}_x = m_x$, (19) becomes

$$\hat{m}_y(z^n) = \arg\max_{j \in \{1, 2, \ldots, 2^{nR_y}\}} I(z^n; x^n(m_x), y^n(j)). \tag{21}$$

So the second term on the RHS of (20), $\Pr(\hat{m}_y \neq m_y | \hat{m}_x = m_x)$, can be upper bounded as shown
in (15). Substituting the upper bounds (18) and (15) into (20), we have:

$$\Pr((\hat{m}_x, \hat{m}_y) \neq (m_x, m_y)) \le \delta_n + 2^{-n(E_{y|x} - \epsilon_n)}. \tag{22}$$

This upper bound (22) converges to 0 as $n$ goes to infinity. However in Appendix B we show that

$$P^n_{e(xy)}(R_x, R_y, P_X, P_Y) = \Pr((\hat{m}_x, \hat{m}_y) \neq (m_x, m_y)) > \frac{1}{2}. \tag{23}$$

This contradicts (22). □

3) Region III: This is a corollary of Lemma 5. This is intuitively obvious since for each rate pair
$(R_x, R_y)$ in Region III, we can find a rate pair $(R_x, R'_y)$ in Region IV such that $R_y > R'_y$. We construct
a contradiction as follows. For a $(R_x, R_y)$ decoder, we can construct a new decoder for $(R_x, R'_y)$, where
$R'_y < R_y$, by revealing a random selection of a $(R_x, R_y)$ code book that is a superset of the $(R_x, R'_y)$
code book to the $(R_x, R_y)$ decoder and accepting the estimate of the $(R_x, R_y)$ decoder as the estimate for
the $(R_x, R'_y)$ decoder. If the average error probability is small for the $(R_x, R_y)$ code books, the average
error probability is small for this particular $(R_x, R'_y)$ decoder as well; this contradicts Lemma 5.
Hence the decoding error for encoder X does not converge to 0 with $n$ if the rate pair $(R_x, R_y)$ is in
Region III. □

This concludes the converse part of the proof of Theorem 1.

IV. Discussions on Time-sharing 
The main result of this paper is the randomized fixed-composition coding capacity region for X, that is,
$\mathcal{R}_x(P_X, P_Y)$ shown in Figure 2. So obviously, the interference channel capacity region, where decoding
errors for both X and Y are small, is the intersection of $\mathcal{R}_x(P_X, P_Y)$ and $\mathcal{R}_y(P_X, P_Y)$, where $\mathcal{R}_y(P_X, P_Y)$
is defined in a similar way but with channel $\tilde{W}_{\tilde{Z}|XY}$ instead of $W_{Z|XY}$. The intersected region defined
in (5), $\mathcal{R}_{xy}(P_X, P_Y)$, is in general non-convex, as shown in Figure 3. Similar to the multiple-access channel
capacity region studied in Chapter 15.3 of [2], we use this capacity region $\mathcal{R}_{xy}(P_X, P_Y)$ as the building
block to generate larger capacity regions.

A. A digression to MAC channel capacity region 

Before giving the time-sharing results for interference channels and showing why the simple time-sharing
idea works for MAC channels but not for interference channels, we first look at $\mathcal{R}_x(P_X, P_Y)$ in Figure 2.
Region II is obviously the multiple-access channel $W_{Z|XY}$ region achieved by input composition
$(P_X, P_Y)$ at the two encoders, denoted by $\mathcal{R}^{mac}_{xy}(P_X \times P_Y)$. In [2], the full description of the MAC
channel capacity region is given in two different manners:

$$\mathrm{CONVEX}\left(\bigcup_{P_X, P_Y} \mathcal{R}^{mac}_{xy}(P_X \times P_Y)\right) = \mathrm{CLOSURE}\left(\bigcup_{P_U, P_{X|U}, P_{Y|U}} \mathcal{R}^{mac}_{xy}(P_{X|U} \times P_{Y|U} \times P_U)\right)$$

where $\mathcal{R}^{mac}_{xy}(P_{X|U} \times P_{Y|U} \times P_U) = \{(R_x, R_y): R_x < I(X; Z|Y, U),\ R_y < I(Y; Z|X, U),\ R_x + R_y <
I(X, Y; Z|U)\}$ and $U$ is the time-sharing auxiliary random variable with $|\mathcal{U}| = 4$.
The LHS of the equality above is the convex hull of all the fixed-composition MAC channel capacity regions. The
RHS is the closure (without convexification) of all the time-sharing MAC capacity regions. The
equivalence is non-trivial; it is not a consequence of the tightness of the achievable region. It
hinges on the convexity of the "basic" capacity regions $\mathcal{R}^{mac}_{xy}(P_X \times P_Y)$. As will be shown in Section IV-C,
this is not the case for interference channels, i.e. the corresponding equality does not hold anymore.

B. Simple time-sharing capacity region and error exponent 

The simple idea of time-sharing is well studied in multi-user channel coding, for example in broadcast channel coding. Whenever there are two operational points $(R_x^1, R_y^1)$ and $(R_x^2, R_y^2)$, and there exist two coding schemes that achieve small error probability at the respective operational points, one can use $\lambda n$ channel uses at $(R_x^1, R_y^1)$ with coding scheme 1 and $(1-\lambda)n$ channel uses at $(R_x^2, R_y^2)$ with coding scheme 2. The rate of this coding scheme is $(\lambda R_x^1 + (1-\lambda)R_x^2,\ \lambda R_y^1 + (1-\lambda)R_y^2)$ and the error probability is still small (see footnote 2): it is no bigger than the sum of the two small error probabilities. This idea is easily generalized to more than 2 operational points.
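The rate computation in simple time-sharing can be sketched in a few lines. This is only an illustration: the function name `time_share` and the error-probability arguments are hypothetical, and integer rounding of the block lengths is ignored.

```python
def time_share(point_1, point_2, lam, err_1=0.0, err_2=0.0):
    """Rate pair and union-bound error for simple time-sharing.

    point_1, point_2: operational points (R_x, R_y) of scheme 1 and 2.
    lam: fraction of the n channel uses assigned to scheme 1.
    err_1, err_2: error probabilities of the two schemes (illustrative inputs).
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    rate_x = lam * point_1[0] + (1.0 - lam) * point_2[0]
    rate_y = lam * point_1[1] + (1.0 - lam) * point_2[1]
    # Union bound: the overall error is at most the sum of the two errors.
    return (rate_x, rate_y), min(1.0, err_1 + err_2)


rates, error_bound = time_share((1.0, 0.0), (0.0, 1.0), 0.25, 0.01, 0.02)
```

Sweeping `lam` over [0, 1] traces the line segment between the two operational points, which is why simple time-sharing of a family of regions yields their convex hull.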

This simple time-sharing idea works perfectly for MAC channel coding, as shown by the MAC identity in Section IV-A. The whole capacity region can be described as time-sharing among fixed-composition codes, where the fixed-composition codes are the building blocks. If we extend this idea to the interference channel, we have the following simple time-sharing region, as discussed in Section III-D:

\[
\mathrm{CONVEX}\Big(\bigcup_{P_X, P_Y} \mathcal{R}_{xy}(P_X, P_Y)\Big) = \mathrm{CONVEX}\Big(\bigcup_{P_X, P_Y} \mathcal{R}_x(P_X, P_Y) \cap \mathcal{R}_y(P_X, P_Y)\Big). \tag{24}
\]

We shall soon see in the next section that this result can be improved. 

C. Beyond simple time-sharing: "Uniform" time-sharing

In this section we give a time-sharing coding scheme that was first developed by Gallager [11] and later further studied for universal decoding by Pokorny and Wallmeier [14] to get better error exponents for MAC channels. This type of "uniform" time-sharing scheme not only achieves better error exponents; more importantly, we show that it achieves a bigger capacity region than the simple time-sharing scheme does for interference channels. This is unique to interference channels, unlike multiple-access channels where simple time-sharing achieves the whole capacity region. The reason is that the capacity region is the convex hull of the intersections of pairs of non-convex regions (convexity is not the issue here; the real difference is the intersection operation).

The organization of this section parallels that for the fixed-composition codes. We first introduce the "uniform" time-sharing coding scheme, then give the achievable error exponents, and lastly derive the achievable rate region for such coding schemes. The proofs are omitted since they are similar to those for the randomized fixed-composition codes.

Definition 3: "Uniform" time-sharing codes: for a probability distribution $P_U$ on $\mathcal{U}$, where $\mathcal{U} = \{u_1, u_2, \ldots, u_K\}$ with $\sum_{j=1}^{K} P_U(u_j) = 1$, and a pair of conditionally independent distributions $P_{X|U}$, $P_{Y|U}$, we define the two codeword sets (see footnote 3) as

\[
X_c(n) = \Big\{ x^n : x_1^{nP_U(u_1)} \in P_{X|u_1},\ x_{nP_U(u_1)+1}^{n(P_U(u_1)+P_U(u_2))} \in P_{X|u_2},\ \ldots,\ x_{n(1-P_U(u_K))+1}^{n} \in P_{X|u_K} \Big\},
\]

2 The error exponent is, however, at most half of the individual error exponent. 
3 Again, we ignore the nuisance of the non-integers here. 
Fig. 4. "Uniform" time-sharing capacity region $\mathcal{R}_x(P_U P_{X|U} P_{Y|U})$ for X (axis mark: $I(X;Z|Y,U)$). The achievable region is the union of Regions I and II. This region is very similar to that for fixed-composition coding shown in Figure 2; the only difference is that there is now an auxiliary time-sharing random variable U.



i.e. the i'th chunk of the codeword $x^n$, with length $nP_U(u_i)$, has composition $P_{X|u_i}$, and similarly

\[
Y_c(n) = \Big\{ y^n : y_1^{nP_U(u_1)} \in P_{Y|u_1},\ y_{nP_U(u_1)+1}^{n(P_U(u_1)+P_U(u_2))} \in P_{Y|u_2},\ \ldots,\ y_{n(1-P_U(u_K))+1}^{n} \in P_{Y|u_K} \Big\}.
\]

A "uniform" time-sharing code $(R_x, R_y, P_U P_{X|U} P_{Y|U})$ encoder picks a code book with the following probability: for any message $m_x \in \{1, 2, \ldots, 2^{nR_x}\}$, the codeword $x^n(m_x)$ is uniformly distributed in $X_c(n)$; similarly for encoder Y.
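As an illustration of Definition 3, the following sketch draws one codeword uniformly from the codeword set for X. The function names are hypothetical, and the sketch assumes that every chunk length and every symbol count is an integer (the non-integer nuisance is ignored, as in the text). A uniformly random permutation of a sequence with the prescribed symbol counts is uniform over the corresponding type class, and independent chunks give the uniform distribution over the product set.

```python
import random
from fractions import Fraction


def chunk_with_composition(length, composition, rng):
    """Draw a sequence uniformly from the type class of `composition`.

    composition maps each symbol to its probability (use Fractions);
    length * probability must be an integer for every symbol.
    """
    symbols = []
    for symbol, prob in composition.items():
        count = Fraction(prob) * length
        if count.denominator != 1:
            raise ValueError("length * probability must be an integer")
        symbols.extend([symbol] * int(count))
    if len(symbols) != length:
        raise ValueError("composition must sum to one")
    rng.shuffle(symbols)  # uniform over arrangements of this multiset
    return symbols


def uniform_time_sharing_codeword(n, p_u, p_x_given_u, rng):
    """Concatenate one fixed-composition chunk per time-sharing symbol u."""
    codeword = []
    for u, prob_u in p_u.items():  # chunks appear in the order u_1, ..., u_K
        chunk_len = Fraction(prob_u) * n
        if chunk_len.denominator != 1:
            raise ValueError("n * P_U(u) must be an integer")
        codeword.extend(chunk_with_composition(int(chunk_len), p_x_given_u[u], rng))
    return codeword


half = Fraction(1, 2)
p_u = {0: half, 1: half}
p_x_given_u = {
    0: {0: Fraction(3, 4), 1: Fraction(1, 4)},
    1: {0: Fraction(1, 4), 1: Fraction(3, 4)},
}
codeword = uniform_time_sharing_codeword(8, p_u, p_x_given_u, random.Random(0))
```

A full random code book would repeat this draw independently for each of the $2^{nR_x}$ messages, and likewise for Y with $P_{Y|U}$.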

After the code book is randomly generated and revealed to the decoder, the decoder uses a maximum mutual information decoding rule. Similar to fixed-composition coding, the decoder either decodes both messages X and Y jointly or simply treats Y as noise and decodes X only, depending on whether the rate pair is in Region I or Region II, as shown in Figure 4. The error probability we investigate is again the average error probability over all messages and code books.

Theorem 2: Interference channel capacity region $\mathcal{R}_x(P_U P_{X|U} P_{Y|U})$ for "uniform" time-sharing codes with composition $P_U P_{X|U} P_{Y|U}$:

\[
\mathcal{R}_x(P_U P_{X|U} P_{Y|U}) = \{(R_x, R_y) : 0 \le R_x < I(X;Z|U),\ 0 \le R_y\} \cup \{(R_x, R_y) : 0 \le R_x < I(X;Z|Y,U),\ R_x + R_y < I(X,Y;Z|U)\} \tag{25}
\]

where the random variables in (25) satisfy $(U, X, Y, Z) \sim P_U P_{X|U} P_{Y|U} W_{Z|X,Y}$. The interference capacity region for $P_U P_{X|U} P_{Y|U}$ is

\[
\mathcal{R}_{xy}(P_U P_{X|U} P_{Y|U}) = \mathcal{R}_x(P_U P_{X|U} P_{Y|U}) \cap \mathcal{R}_y(P_U P_{X|U} P_{Y|U}). \tag{26}
\]
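The region in (25) can be evaluated numerically for small alphabets. The sketch below is an illustration only: the helper names and the dictionary layout of the distributions are assumptions made here, not notation from the paper. It builds the joint distribution $P_U P_{X|U} P_{Y|U} W_{Z|X,Y}$, computes the three conditional mutual informations in bits, and tests membership of a rate pair, taking $R_y \ge 0$ as implicit in the second set.

```python
from itertools import product
from math import log2

U, X, Y, Z = 0, 1, 2, 3  # coordinate positions in an outcome (u, x, y, z)


def joint(p_u, p_x_u, p_y_u, w):
    """Joint pmf of (U, X, Y, Z) ~ P_U P_{X|U} P_{Y|U} W_{Z|X,Y} as a dict."""
    pmf = {}
    for u, pu in p_u.items():
        for (x, px), (y, py) in product(p_x_u[u].items(), p_y_u[u].items()):
            for z, pz in w[(x, y)].items():
                key = (u, x, y, z)
                pmf[key] = pmf.get(key, 0.0) + pu * px * py * pz
    return pmf


def entropy(pmf, keep):
    """Entropy in bits of the marginal on the coordinates listed in keep."""
    marginal = {}
    for outcome, p in pmf.items():
        key = tuple(outcome[i] for i in keep)
        marginal[key] = marginal.get(key, 0.0) + p
    return -sum(p * log2(p) for p in marginal.values() if p > 0)


def cond_mi(pmf, a, b, c):
    """I(A; B | C) = H(A,C) + H(B,C) - H(A,B,C) - H(C)."""
    return (entropy(pmf, a + c) + entropy(pmf, b + c)
            - entropy(pmf, a + b + c) - entropy(pmf, c))


def in_region_x(r_x, r_y, pmf):
    """Membership test for the region in (25)."""
    i_x_z = cond_mi(pmf, [X], [Z], [U])          # I(X;Z|U)
    i_x_z_y = cond_mi(pmf, [X], [Z], [Y, U])     # I(X;Z|Y,U)
    i_xy_z = cond_mi(pmf, [X, Y], [Z], [U])      # I(X,Y;Z|U)
    region_1 = 0 <= r_x < i_x_z and r_y >= 0
    region_2 = 0 <= r_x < i_x_z_y and r_y >= 0 and r_x + r_y < i_xy_z
    return region_1 or region_2


# Toy channel: Z = X xor Y, uniform binary inputs, trivial time-sharing variable.
uniform_bit = {0: 0.5, 1: 0.5}
xor_channel = {(x, y): {x ^ y: 1.0} for x in (0, 1) for y in (0, 1)}
pmf = joint({0: 1.0}, {0: uniform_bit}, {0: uniform_bit}, xor_channel)
```

For this toy channel, treating Y as noise gives $I(X;Z|U) = 0$, so Region I is empty and only rate pairs in Region II, where X and Y are decoded jointly, pass the test.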

The rate region defined in (25) does not by itself give any new capacity region for X, since both Region I and Region II in Figure 4 can be achieved by simple time-sharing of Regions I and II, respectively, of the fixed-composition region $\mathcal{R}_x(P_X, P_Y)$. But for the interference channel capacity, we argue in the next section that this coding scheme gives a strictly bigger capacity region than that given by the simple time-sharing of fixed-composition codes in (24).

The proof of Theorem 2 is similar to that of Theorem 1, so we omit the details here. We only point out that the achievability part is proved by deriving a positive error exponent for every rate pair in the interior of the capacity region defined in Theorem 2. As shown in [14], and also detailed in this paper for randomized coding, the error exponent in Region II of Figure 4 is:

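\[
E = \min\{E_{xy}, E_{x|y}, E_{y|x}\}, \quad \text{where}
\]

\begin{align}
E_{xy} &= \min_{Q_{XYZ|U}:\, Q_{X|U} = P_{X|U},\, Q_{Y|U} = P_{Y|U}} D(Q_{Z|XY}\|W|Q_{XYU}) + D(Q_{XY|U}\|P_{X|U} \times P_{Y|U}|P_U) + |I_Q(X,Y;Z|U) - R_x - R_y|^+ \notag\\
E_{x|y} &= \min_{Q_{XYZ|U}:\, Q_{X|U} = P_{X|U},\, Q_{Y|U} = P_{Y|U}} D(Q_{Z|XY}\|W|Q_{XYU}) + D(Q_{XY|U}\|P_{X|U} \times P_{Y|U}|P_U) + |I_Q(X;Z|Y,U) - R_x|^+ \notag\\
E_{y|x} &= \min_{Q_{XYZ|U}:\, Q_{X|U} = P_{X|U},\, Q_{Y|U} = P_{Y|U}} D(Q_{Z|XY}\|W|Q_{XYU}) + D(Q_{XY|U}\|P_{X|U} \times P_{Y|U}|P_U) + |I_Q(Y;Z|X,U) - R_y|^+ \notag
\end{align}

These are the error exponents in Lemma 1 with a conditional auxiliary random variable U.

The error exponent in Region I is

\[
E_x = \min_{Q_{XYZ|U}:\, Q_{X|U} = P_{X|U},\, Q_{Y|U} = P_{Y|U}} D(Q_{Z|XY}\|W|Q_{XYU}) + D(Q_{XY|U}\|P_{X|U} \times P_{Y|U}|P_U) + |I_Q(X;Z|U) - R_x|^+ .
\]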

D. Why is "uniform" time-sharing needed?

It is obvious that "uniform" time-sharing fixed-composition coding gives a bigger error exponent than simple time-sharing coding does. More interestingly, we argue that it gives a bigger interference channel capacity region. First we write down the interference channel capacity region generated from the basic "uniform" time-sharing fixed-composition codes:

\[
\mathrm{CONVEX}\Big(\bigcup_{P_{X|U} P_{Y|U} P_U} \mathcal{R}_{xy}(P_U P_{X|U} P_{Y|U})\Big) \tag{27}
\]

where $\mathcal{R}_{xy}(P_U P_{X|U} P_{Y|U})$ is defined in (26) and $\mathrm{CONVEX}(A)$ is the convex hull (simple time-sharing) of the set A.

U is a time-sharing auxiliary random variable. Unlike the MAC coding problem, where simple time-sharing of fixed-composition codes achieves the full capacity region, this is not guaranteed for interference channels. The reason is the intersection operator in the basic building blocks in (5) and (26) respectively, i.e. the interference nature of the problem (see footnote 4).

The rate region obtained by simple time-sharing of fixed-composition codes in (24) is a subset of the simple time-sharing of the "uniform" time-sharing capacity regions in (27), since a fixed-composition code is a "uniform" time-sharing code with a constant U. In the following example, we illustrate why (27) is bigger than (24).



4 To understand why intersection, and not non-convexity, is the difference, we consider four convex sets: $A_1, A_2, B_1, B_2$. We show that $\mathrm{CONVEX}(A_1 \cap B_1, A_2 \cap B_2)$ can be strictly smaller than $\mathrm{CONVEX}(A_1, A_2) \cap \mathrm{CONVEX}(B_1, B_2)$. Let $A_1 = B_2 \subset B_1 = A_2$; then $\mathrm{CONVEX}(A_1 \cap B_1, A_2 \cap B_2) = A_1$ is strictly smaller than $\mathrm{CONVEX}(A_1, A_2) \cap \mathrm{CONVEX}(B_1, B_2) = A_2$. This shows why uniform time-sharing gives a bigger capacity region.
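The set argument in footnote 4 can be checked on a one-dimensional instance. In this sketch the four convex sets are closed intervals chosen for illustration, with $A_1 = B_2 = [0, 1]$ and $B_1 = A_2 = [0, 2]$; the helper names are hypothetical.

```python
def intersect(a, b):
    """Intersection of two closed intervals (lo, hi); None if empty."""
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo <= hi else None


def convex_hull(*intervals):
    """Convex hull of a union of closed intervals, ignoring empty ones."""
    present = [iv for iv in intervals if iv is not None]
    return (min(lo for lo, _ in present), max(hi for _, hi in present))


a1 = b2 = (0.0, 1.0)
b1 = a2 = (0.0, 2.0)

# Intersect first, then convexify: this mirrors simple time-sharing in (24).
hull_of_intersections = convex_hull(intersect(a1, b1), intersect(a2, b2))
# Convexify first, then intersect: the larger set in footnote 4.
intersection_of_hulls = intersect(convex_hull(a1, a2), convex_hull(b1, b2))
```

Here the first set is $[0, 1] = A_1$ and the second is $[0, 2] = A_2$, so the order of the two operations matters even though every set involved is convex.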
Fig. 5. Simple time-sharing of fixed-composition capacity regions (ABCDO) vs. the time-sharing fixed-composition capacity region with $P_U(0) = P_U(1) = 0.5$ (the black pentagon).

Example: Suppose we have a symmetric interference channel, i.e. $\mathcal{R}_x(P_X, P_Y) = \mathcal{R}_y^T(P_Y, P_X)$ for all $P_X, P_Y$, where T is the transpose operation. The simple time-sharing capacity region and the more sophisticated time-sharing fixed-composition capacity region are compared in the toy example of Figure 5.

For a distribution $(P_X, P_Y)$, the achievable regions for the fixed-composition code, $\mathcal{R}_x(P_X, P_Y)$ and $\mathcal{R}_y(P_X, P_Y)$, are illustrated in Figure 5. They are bounded by the red dotted lines and the red dash-dotted lines respectively, so the interference capacity region $\mathcal{R}_{xy}(P_X, P_Y)$ is bounded by the pentagon ABEFO. By symmetry, $\mathcal{R}_x(P_Y, P_X)$ and $\mathcal{R}_y(P_Y, P_X)$ are bounded by the blue dotted lines and the blue dash-dotted lines respectively, and the capacity region $\mathcal{R}_{xy}(P_Y, P_X)$ is bounded by the pentagon HGCDO. So the convex hull of these two regions is ABCDO.

Now consider the following time-sharing fixed-composition coding $P_{X|U} P_{Y|U} P_U$, where $\mathcal{U} = \{0, 1\}$, $P_U(0) = P_U(1) = 0.5$, $P_{X|0} = P_{Y|1} = P_X$ and $P_{X|1} = P_{Y|0} = P_Y$. The interference capacity region is bounded by the black pentagon in Figure 5. This toy example shows why (27) is bigger than (24).



V. Future directions

The most interesting question about interference channels is the geometry of the two code books. For point-to-point channel coding, the codewords of the optimal code book are uniformly distributed on a sphere of the optimal composition, and the optimal composition achieves the capacity. For MAC channels, simple time-sharing among different fixed-composition codes is sufficient and necessary to achieve the whole capacity region, while within each fixed-composition code the codewords are uniformly distributed. However, as illustrated in Section IV, a more interesting "uniform" time-sharing is needed for interference channels. So what is time-sharing? Both simple time-sharing and "uniform" time-sharing change the shape of the code books, but in different ways. Simple time-sharing "glues" segments of codewords together, due to the independence of the coding in different segments of the channel uses, while for "uniform" time-sharing the codewords still have equal distances between one another. A better understanding of the shape of code books may help us understand interference channels. Also, in this paper we give our first attempt at an outer bound on the interference channel capacity region. We only manage to give a tight outer bound for time-sharing fixed-composition codes. An important future direction is to categorize the coding schemes for interference channels, after which more outer bound results may follow. This is in contrast to the traditional outer bound derivations [1], where a genie is used.
Appendix

A. Proof of (13), (14) and (16)

We give a unified proof for upper bounding the error probability of randomized fixed-composition coding, where the error probabilities in (13), (14) and (16) are taken over all messages, code books and channel behaviors. We examine the objective functions to be minimized in (13), (14) and (16).

First, consider the common part of the three error exponents $E_{xy}$, $E_{x|y}$ and $E_x$: $D(Q_{Z|XY}\|W|Q_{XY}) + D(Q_{XY}\|P_X \times P_Y)$. To first order in the exponent, $D(Q_{XY}\|P_X \times P_Y)$ is the normalized logarithm of the inverse of the probability that the type $Q_{XY}$ is the empirical distribution of the codeword pair $x^n(1), y^n(1)$, independently generated from the fixed compositions $P_X$ and $P_Y$. Likewise, $D(Q_{Z|XY}\|W|Q_{XY})$ is the normalized logarithm of the inverse of the conditional probability that, when the input to the channel W has type $Q_{XY}$, the empirical type of the input/output is $Q_{XYZ} = Q_{XY} \times Q_{Z|XY}$.

Secondly, consider the individual parts of the error exponents in (13), (14) and (16): $|I_Q(X,Y;Z) - R_x - R_y|^+$, $|I_Q(X;Z|Y) - R_x|^+$ and $|I_Q(X;Z) - R_x|^+$ respectively. Each one is the normalized logarithm of the inverse of an upper bound on the probability that there exists another message (pair) with higher mutual information with the channel output, while the channel input/output has type $Q_{XYZ}$. This is derived by a union bound argument. We now give the details of the proofs.



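1) Proof of (13): Because of the symmetry of the code book selection, we can fix the message pair $(m_x, m_y) = (1, 1)$ and write the error probability in (13) in the following way:

\begin{align}
&\Pr(m_x \neq \hat{m}_x, m_y \neq \hat{m}_y) \notag\\
&= \Big(\frac{1}{|\mathcal{T}^n(P_X)|}\Big)^{2^{nR_x}} \Big(\frac{1}{|\mathcal{T}^n(P_Y)|}\Big)^{2^{nR_y}} \sum_{c_X} \sum_{c_Y} \sum_{z^n} W_{Z|XY}(z^n|x^n(m_x), y^n(m_y))\, \mathbf{1}(\hat{m}_x(z^n) \neq m_x, \hat{m}_y(z^n) \neq m_y) \tag{28}\\
&= \Big(\frac{1}{|\mathcal{T}^n(P_X)|}\Big)^{2^{nR_x}} \Big(\frac{1}{|\mathcal{T}^n(P_Y)|}\Big)^{2^{nR_y}} \sum_{c_X} \sum_{c_Y} \sum_{z^n} W_{Z|XY}(z^n|x^n(1), y^n(1))\, \mathbf{1}(\hat{m}_x(z^n) \neq 1, \hat{m}_y(z^n) \neq 1) \notag\\
&= \sum_{Q_{XY}:\, Q_X = P_X,\, Q_Y = P_Y} \Big\{ \Pr((x^n(1), y^n(1)) \in Q_{XY}) \sum_{Q_{Z|XY}} \Pr(z^n|(x^n(1), y^n(1)) \in Q_{Z|XY})\, \Pr(\hat{m}_x(z^n) \neq 1, \hat{m}_y(z^n) \neq 1) \Big\} \tag{29}\\
&\le \sum_{Q_{XY}:\, Q_X = P_X,\, Q_Y = P_Y} \Big\{ \Pr((x^n(1), y^n(1)) \in Q_{XY}) \sum_{Q_{Z|XY}} \Pr(z^n|(x^n(1), y^n(1)) \in Q_{Z|XY}) \notag\\
&\qquad \min\Big\{1, \sum_{i=2}^{2^{nR_x}} \sum_{j=2}^{2^{nR_y}} \Pr\big(I(z^n; x^n(1), y^n(1)) \le I(z^n; x^n(i), y^n(j)) \,\big|\, (x^n(1), y^n(1), z^n) \in Q_{XYZ}\big)\Big\} \Big\} \notag\\
&\le |\mathcal{T}^n_{XYZ}| \max_{Q_{XYZ}:\, Q_X = P_X,\, Q_Y = P_Y} \Pr((x^n(1), y^n(1)) \in Q_{XY})\, \Pr(z^n|(x^n(1), y^n(1)) \in Q_{Z|XY}) \notag\\
&\qquad \min\Big\{1, \sum_{i=2}^{2^{nR_x}} \sum_{j=2}^{2^{nR_y}} \Pr\big(I(z^n; x^n(1), y^n(1)) \le I(z^n; x^n(i), y^n(j)) \,\big|\, (x^n(1), y^n(1), z^n) \in Q_{XYZ}\big)\Big\} \tag{30}
\end{align}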

(28) and (29) are two different interpretations of the same error probability. In (28), we first randomly pick a fixed-composition code book pair $c_X$ and $c_Y$, then sum over the probabilities that the output of the channel causes a decoding error for the chosen code book pair. (29) is an equivalent interpretation of the above error probability because the codewords for different messages are independently generated. We interpret (29) as follows: we first randomly pick a codeword pair for message 1 in X and message 1 in Y, then the codeword pair is transmitted through the channel. Then we randomly generate the rest of the code book and investigate the probability that other message pairs maximize the mutual information with the channel output. We upper bound the four terms in (30) individually in (31), (32), (33) and (34).

First, the number of type sets of length n:

\[
|\mathcal{T}^n_{XYZ}| \le (n+1)^{|\mathcal{X} \times \mathcal{Y} \times \mathcal{Z}|} = 2^{n \frac{\log(n+1)}{n} |\mathcal{X} \times \mathcal{Y} \times \mathcal{Z}|} = 2^{n a_n}. \tag{31}
\]
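The polynomial bound on the number of types can be checked exactly for small cases. This sketch uses the standard stars-and-bars count of types with denominator n, which is an addition made here for illustration; the function names are hypothetical.

```python
from math import comb


def number_of_types(n, alphabet_size):
    """Exact number of types with denominator n on an alphabet of the given
    size: the number of ways to split n into alphabet_size non-negative counts."""
    return comb(n + alphabet_size - 1, alphabet_size - 1)


def type_count_bound(n, alphabet_size):
    """The polynomial upper bound (n + 1)^alphabet_size used in (31)."""
    return (n + 1) ** alphabet_size


# Binary X, Y and Z give a joint alphabet of size 2 * 2 * 2 = 8.
exact = number_of_types(10, 8)
bound = type_count_bound(10, 8)
```

Both quantities grow only polynomially in n, which is why the factor contributes a vanishing term $a_n$ to the exponent.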

Secondly, for any $Q_{XY}$ such that $Q_X = P_X$ and $Q_Y = P_Y$, from the method of types [2] and [3], we know that $2^{n(H(Q_Y) - \frac{\log(n+1)}{n}|\mathcal{Y}|)} \le |P_Y| \le 2^{nH(Q_Y)}$; similar bounds apply to $|P_X|$. And for a fixed X-sequence $x^n(1) \in P_X = Q_X$, we have $2^{n(H(Q_{Y|X}) - \frac{\log(n+1)}{n}|\mathcal{X}||\mathcal{Y}|)} \le |\{y^n \in \mathcal{Y}^n : (x^n(1), y^n) \in Q_{XY}\}| \le 2^{nH(Q_{Y|X})}$. The codewords $x^n(1)$ and $y^n(1)$ are independently distributed in the type sets $P_X$ and $P_Y$. Hence,

\[
\Pr((x^n(1), y^n(1)) \in Q_{XY}) = \frac{|\{y^n \in P_Y : (x^n(1), y^n) \in Q_{XY}\}|}{|P_Y|} \le 2^{n(H(Q_{Y|X}) - H(Q_Y) + \frac{\log(n+1)}{n}|\mathcal{Y}|)}.
\]

Notice that $H(Q_{Y|X}) - H(Q_Y) = -D(Q_{XY}\|Q_X \times Q_Y) = -D(Q_{XY}\|P_X \times P_Y)$, and let $b_n = \frac{\log(n+1)}{n}|\mathcal{Y}|$; we have:

\[
\Pr((x^n(1), y^n(1)) \in Q_{XY}) \le 2^{-n(D(Q_{XY}\|P_X \times P_Y) - b_n)}. \tag{32}
\]
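For binary alphabets the probability in (32) can be computed exactly, which gives a quick sanity check of the bound. This sketch makes its own assumptions: $x^n$ is a fixed sequence with k ones, $y^n$ is uniform over sequences with l ones, the joint type is indexed by the number a of positions where both are one, and the slack is taken as $(n+1)^{|\mathcal{Y}|}$ with $|\mathcal{Y}| = 2$. The function names are hypothetical.

```python
from math import comb, log2


def joint_type_probability(n, k, l, a):
    """Probability that y^n, uniform over binary sequences with l ones,
    overlaps a fixed x^n with k ones in exactly a positions."""
    if a < 0 or a > k or l - a < 0 or l - a > n - k:
        return 0.0
    return comb(k, a) * comb(n - k, l - a) / comb(n, l)


def divergence_from_product(n, k, l, a):
    """D(Q_XY || P_X x P_Y) in bits for the joint type fixed by (n, k, l, a)."""
    counts = {(1, 1): a, (1, 0): k - a, (0, 1): l - a, (0, 0): n - k - l + a}
    p_x = {1: k / n, 0: 1 - k / n}
    p_y = {1: l / n, 0: 1 - l / n}
    total = 0.0
    for (x, y), count in counts.items():
        if count > 0:
            q = count / n
            total += q * log2(q / (p_x[x] * p_y[y]))
    return total


n, k, l = 20, 8, 10
checks = []
for a in range(0, k + 1):
    prob = joint_type_probability(n, k, l, a)
    # Bound of the form 2^{-n (D - b_n)} with b_n = 2 * log2(n + 1) / n.
    bound = (n + 1) ** 2 * 2.0 ** (-n * divergence_from_product(n, k, l, a))
    checks.append(prob <= bound * (1 + 1e-9))
```

The exact probability is a hypergeometric term, so summing it over all feasible a gives one, while each individual term decays exponentially in n at rate $D(Q_{XY}\|P_X \times P_Y)$.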
Thirdly, for $(x^n(1), y^n(1)) \in Q_{XY}$ and any empirical channel behavior $Q_{Z|XY}$:

\begin{align}
\Pr(z^n|(x^n(1), y^n(1)) \in Q_{Z|XY}) &= |\{z^n : (x^n(1), y^n(1), z^n) \in Q_{XYZ}\}|\, W_{Z|XY}(Q_{Z|XY}) \notag\\
&\le 2^{nH(Q_{Z|XY})} \times 2^{n(-D(Q_{Z|XY}\|W|Q_{XY}) - H(Q_{Z|XY}))} \notag\\
&= 2^{-nD(Q_{Z|XY}\|W|Q_{XY})} \tag{33}
\end{align}

Finally, for $(x^n(1), y^n(1), z^n) \in Q_{XYZ}$, we investigate the probability that there exists $(i, j)$, $i \neq 1$, $j \neq 1$, such that the mutual information between $(x^n(i), y^n(j))$ and $z^n$ is at least as large as the mutual information between $(x^n(1), y^n(1))$ and $z^n$. For all $i \neq 1$, the codeword $x^n(i)$ is uniformly distributed on the fixed-composition set $P_X$; the same holds for Y. Given $(x^n(1), y^n(1), z^n) \in Q_{XYZ}$, we have $I(z^n; x^n(1), y^n(1)) = I_Q(Z; X, Y)$, so, with all sums and the maximum below taken over $\{V_{XYZ} : V_X = Q_X,\ V_Y = Q_Y,\ V_Z = Q_Z,\ I_Q(Z;X,Y) \le I_V(Z;X,Y)\}$:

\begin{align}
&\min\Big\{1, \sum_{i=2}^{2^{nR_x}} \sum_{j=2}^{2^{nR_y}} \Pr\big(I(z^n; x^n(1), y^n(1)) \le I(z^n; x^n(i), y^n(j)) \,\big|\, (x^n(1), y^n(1), z^n) \in Q_{XYZ}\big)\Big\} \notag\\
&\le \min\Big\{1, 2^{n(R_x + R_y)} \sum_{V_{XYZ}} \Pr\big((x^n(i), y^n(j), z^n) \in V_{XYZ} \,\big|\, z^n \in Q_Z\big)\Big\} \notag\\
&= \min\Big\{1, 2^{n(R_x + R_y)} \sum_{V_{XYZ}} \frac{|\{(x^n, y^n) \in P_X \times P_Y : (x^n, y^n, z^n) \in V_{XYZ}\}|}{|\{x^n : x^n \in P_X\}|\,|\{y^n : y^n \in P_Y\}|}\Big\} \notag\\
&\le \min\Big\{1, 2^{n(R_x + R_y)} \sum_{V_{XYZ}} 2^{n(H_V(X,Y|Z) - H_V(X) - H_V(Y) + \frac{\log(n+1)}{n}(|\mathcal{X}| + |\mathcal{Y}|))}\Big\} \notag\\
&\le \min\Big\{1, 2^{n(R_x + R_y)} \sum_{V_{XYZ}} 2^{n(H_V(X,Y|Z) - H_V(X,Y) + \frac{\log(n+1)}{n}(|\mathcal{X}| + |\mathcal{Y}|))}\Big\} \notag\\
&= \min\Big\{1, 2^{n(R_x + R_y)} \sum_{V_{XYZ}} 2^{n(-I_V(X,Y;Z) + \frac{\log(n+1)}{n}(|\mathcal{X}| + |\mathcal{Y}|))}\Big\} \notag\\
&\le \min\Big\{1, 2^{n(R_x + R_y)} (n+1)^{|\mathcal{X} \times \mathcal{Y} \times \mathcal{Z}|}\, 2^{n(-I_Q(X,Y;Z) + \frac{\log(n+1)}{n}(|\mathcal{X}| + |\mathcal{Y}|))}\Big\} \notag\\
&\le 2^{-n(|I_Q(X,Y;Z) - R_x - R_y|^+ - c_n)} \tag{34}
\end{align}

Substituting (31), (32), (33) and (34) in (30), and noticing that $a_n$, $b_n$ and $c_n$ converge to zero when n goes to infinity, (13) is proved.

2) Sketch of the proof of (14) and (16): (14) and (16) can be proved by following the same argument as in the proof of (13). Similar to how we upper bound the LHS of (13) in (30), we upper bound the LHS of (14) by:

\begin{align}
\Pr(m_x \neq \hat{m}_x \,|\, m_y = \hat{m}_y) &\le |\mathcal{T}^n_{XYZ}| \max_{Q_{XYZ}:\, Q_X = P_X,\, Q_Y = P_Y} \Pr((x^n(1), y^n(1)) \in Q_{XY})\, \Pr(z^n|(x^n(1), y^n(1)) \in Q_{Z|XY}) \notag\\
&\qquad \min\Big\{1, \sum_{i=2}^{2^{nR_x}} \Pr\big(I(z^n; x^n(1), y^n(1)) \le I(z^n; x^n(i), y^n(1)) \,\big|\, (x^n(1), y^n(1), z^n) \in Q_{XYZ}\big)\Big\} \tag{35}
\end{align}

and the LHS of (16) by

\begin{align}
\Pr(m_x \neq \hat{m}_x) &\le |\mathcal{T}^n_{XYZ}| \max_{Q_{XYZ}:\, Q_X = P_X,\, Q_Y = P_Y} \Pr((x^n(1), y^n(1)) \in Q_{XY})\, \Pr(z^n|(x^n(1), y^n(1)) \in Q_{Z|XY}) \notag\\
&\qquad \min\Big\{1, \sum_{i=2}^{2^{nR_x}} \Pr\big(I(z^n; x^n(1)) \le I(z^n; x^n(i)) \,\big|\, (x^n(1), y^n(1), z^n) \in Q_{XYZ}\big)\Big\} \tag{36}
\end{align}

The common parts (the three terms on the first line) in (35) and (36) are upper bounded the same way as those in (31), (32) and (33) for (30). The individual parts (the $\min\{1, \cdot\}$ term on the second line) of (35) and (36) are upper bounded by an argument similar to that used for the individual part of (30), shown in (34). We omit the details here. □

B. Proof of (17) and (23)

We give a constant lower bound, $\frac{1}{2}$, on the error probabilities $\Pr(\hat{m}_x \neq m_x)$ and $\Pr((\hat{m}_x, \hat{m}_y) \neq (m_x, m_y))$ in (17) and (23) respectively. The technical details of lower bounding $\Pr(\hat{m}_x \neq m_x)$ are carried out in Appendix B.1. We extend the two very technical Lemmas 5 and 3 from [6] into Lemmas 6 and 7 respectively, where Lemma 7 is used to prove Lemma 6. The proof of the lower bound on $\Pr((\hat{m}_x, \hat{m}_y) \neq (m_x, m_y))$ is similar; we only give the necessary definition of jointly good code books in Appendix B.2.

The difference between the setup in this paper and that in [6] is that we are dealing with an interference channel instead of the memoryless channel in [6]. Hence the notion of a conditionally typical code book in the proof of (17), and of a jointly typical code book in the proof of (23), is necessary.

1) Proof of (17): We give an upper bound on the correct decoding probability $\Pr(\hat{m}_x = m_x) = 1 - \Pr(\hat{m}_x \neq m_x)$ and hence prove the lower bound on $\Pr(\hat{m}_x \neq m_x)$ in (17).

Pr(m x = m x ) = P^ x) (R x ,R y ,P x ,Py) 

E E E W zlXY (z n \x n (m x ),y n )l(m x (z n ) = m x ) 

Cx Tlx Z n 

The codeword $x^n(m_x)$ is uniformly distributed on the type set $P_X$, so the joint type of $(x^n(m_x), y^n)$ is close to $P_X \times P_Y$ with high probability [2], i.e. for all $\sigma > 0$ and large n,

\[
\Pr\big(D((x^n(m_x), y^n)\|P_X \times P_Y) > \sigma\big) < \sigma. \tag{37}
\]

We denote by $T_\sigma(y^n) = \{x^n : D((x^n, y^n)\|P_X \times P_Y) \le \sigma\}$ the typical set conditional on $y^n$. We say a code book $c_X$ is good conditional on $y^n$ if

\[
|c_X \cap T_\sigma^c(y^n)| \le \frac{|c_X|}{4} \tag{38}
\]

where $|c_X| = 2^{nR_x}$. The set of all good code books is denoted by G; at most a fraction $4\sigma$ of the code books are not in G because of (37). For a good code book $c_X$, we use the technique from [6] to upper bound the correct decoding probability:

\begin{align}
\Pr(\hat{m}_x = m_x) &\le \frac{|c_X \cap T_\sigma^c(y^n)|}{|c_X|} + \frac{1}{|c_X|} \sum_{i:\, x^n(i) \in T_\sigma(y^n)} \Pr(i = \hat{m}_x(z^n)) \notag\\
&\le \frac{1}{4} + 2^{-n(E - \epsilon_n)} \tag{39}
\end{align}

where $\epsilon_n$ goes to zero with n, and

\[
E = \min_{Q_{XYZ}:\, D(Q_{XY}\|P_X \times P_Y) \le \sigma} D(Q_{Z|XY}\|W_{Z|XY}|Q_{XY}) + |R_x - I_Q(X;Z|Y)|^+
\]

where (39) is proved by Lemma 6, which is an extension of Lemma 5 in [6] from memoryless to conditional on $y^n$.

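Following the argument in Lemma 3, it is easy to see that $E > 0$ for $R_x > I(X;Z|Y)$ and small $\sigma$, where $(X, Y, Z) \sim W_{Z|XY} \times P_X \times P_Y$. Now we have

\begin{align}
\Pr(\hat{m}_x = m_x) &= \Big(\frac{1}{|\mathcal{T}^n(P_X)|}\Big)^{2^{nR_x}} \Big(\sum_{c_X \in G} \Pr(\hat{m}_x = m_x) + \sum_{c_X \notin G} \Pr(\hat{m}_x = m_x)\Big) \notag\\
&\le \frac{1}{4} + 2^{-n(E - \epsilon_n)} + 4\sigma \tag{40}
\end{align}

Let $\sigma$ be small enough and let n go to infinity, so that $\Pr(\hat{m}_x \neq m_x) = 1 - \Pr(\hat{m}_x = m_x) > \frac{1}{2}$. (17) is proved. □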

The following two Lemmas 6 and 7 are extensions of Lemmas 5 and 3 in [6] respectively. They contain the technical details in the proof of (39).

Lemma 6: Extension of Lemma 5 in [6] from memoryless to conditional on $y^n$. For a good code book $c_X \in G$ as defined in (38), recall that $|c_X \cap T_\sigma(y^n)| \ge \frac{3}{4}|c_X| = \frac{3}{4} \times 2^{nR_x}$. Then for any decoding rule (previously known as $\hat{m}_x$) $\phi : \mathcal{Z}^n \to \{1, 2, \ldots, 2^{nR_x}\}$,

\[
\frac{1}{|c_X|} \sum_{i:\, x^n(i) \in T_\sigma(y^n)} \Pr(i = \phi(z^n)) \le 2^{-n(E - \epsilon_n)} \tag{41}
\]

where

\[
E = \min_{Q_{XYZ}:\, D(Q_{XY}\|P_X \times P_Y) \le \sigma} D(Q_{Z|XY}\|W_{Z|XY}|Q_{XY}) + |R_x - I_Q(X;Z|Y)|^+
\]

and $\epsilon_n = \epsilon(|\mathcal{X}|, |\mathcal{Y}|, |\mathcal{Z}|, n)$ converges to zero as n goes to infinity.

Proof: We write $M = \{i \in \{1, 2, \ldots, 2^{nR_x}\} : x^n(i) \in T_\sigma(y^n)\}$; then we know from the definition of a good code book that $\frac{3}{4} \times 2^{nR_x} \le |M| \le 2^{nR_x} = |c_X|$. Notice that

\[
\Pr(i = \phi(z^n)) = \sum_{z^n:\, \phi(z^n) = i} W_{Z|XY}(z^n|x^n(i), y^n) = W_{Z|XY}(\phi^{-1}(i)|x^n(i), y^n). \tag{42}
\]

We rewrite the LHS of (41):

= 2~ nR * W zlXY (<p-\i)\x n (i),y n ) 

i:x"(i)GT„(y") 

= Yl I E W z \ XY {r l (i)\x n {i),y n ) 

Qxy:D(Qxy\\PxxP y )<(7 \i:(x"-(i),y n )eQ X Y 

= (n + l)\ x W y \ max 

Qxy.D(Qxy\\Px-xPy)<o- 
i:(x"(i),y n )&QxY Qz\xy 

< ( n + i)WI+Wll*| 

Qxyz:D(Q xv ||PxXP^)< ( 7 
i:(a:"(i),2/")e<2xy 

< 2 ne "( 1) max 

<9xy^:D(Qxy||PjtXiV)«7 

< 2 ne "( 1) max 

0xy^:D(Qxy||PjtXiV)«7 

-nfl(Q z|X v||W Z| x.|Qx y ) 2 -n^ V- (^(*), I/") H 0" 1 (*) I 

i: (x n yi) ,y n ) &Qxy 

< 2 ne ^ max (2-"D(QzixY\\W zl xY\QxY) 2 -n\R-I Q (X;Z\Y)-e n (2)\+\ ^ 

Qxyz.D(Qxy\\PxXPy)<(t\ J 

= 2~ n ( E ~ € ^ (44) 
where (43) follows from Lemma 7. The rest follows from the method of types. □

Lemma 7: Extension of Lemma 3 in [6] from memory less to conditional on y n , for any R > R x > 0, 
for any coding system X(y n ) with joint input distribution (x n (i),y n ) 6 Qxy, i = 1,2, ...2 nR *, and 
decoding rule (j) : Z n -> {1, 2, 2 ni? * }, let Q z \XY{x n (i),y n ) = {z n : {x n {i),y n , z n ) € Qxyz} (this is 
the V-shell notation Ty used in [6]), we have: 

J_ v \Qz\XY^U n )^\i)\ 2 - nlR - lQ{X;ZlY y^ 

t nR ti \Qz\x Y (x n W,y n )\ ~ 

where e n = e(n, 1^1, \Z\) converges to zero as n goes to infinity. 
Proof: Write $Q_{Z|Y}(y^n) = \{z^n : (y^n, z^n) \in Q_{ZY}\}$. By the method of types [3], we know that

(n + i)-|2| 2 ntfQ(z|XY) < |Q Z | Xy (x n (i),?/ n )| < 2 n ^( z l xy ) 

and { n + \y\ z h nH ^ z ^ < \Q z \ Y {y n )\ < 2 nH «( z l y ). 
So the LHS of (45) is upper bounded by

1 y \Qz\XY(x n (i),y n )f]^Hi)\ 

2 nR fr! \Q Z \x Y {x n {i),y n )\ 

< (n + 1)1^12-^(^)2-^ £ iQzprcO^O^fV"" 1 ®! 

< (?i + l)l 2 l2-"^( z l xy )2^ nR |Q Z | y (y n )| (46) 

< (ti + i)|2| 2 -nH Q (z|xy) 2 -„i? (n + ^l^n/MZir) 

= 2 -n(ii-/g(X;2|y)-e n ) (47) 

(46) is true because the sets $Q_{Z|XY}(x^n(i), y^n) \cap \phi^{-1}(i)$, $i = 1, 2, \ldots, 2^{nR}$, are disjoint and $\bigcup_i Q_{Z|XY}(x^n(i), y^n) \subseteq Q_{Z|Y}(y^n)$. Now notice that the LHS of (45) is an average of ratios that are each at most 1, hence the LHS of (45) is no bigger than 1. This together with (47) proves Lemma 7. □

2) Proof of (23): The proof is similar to that of (17). The difference is that we need the notion of jointly good code books. A code book pair $(c_X, c_Y)$ is good if

\[
|c_X \times c_Y \cap T_\sigma^c| \le \frac{|c_X||c_Y|}{4} \tag{48}
\]

where the jointly typical set is $T_\sigma = \{(x^n, y^n) : D((x^n, y^n)\|P_X \times P_Y) \le \sigma\}$. The rest of the proof is similar to the proof of (17). We conclude that

\[
\Pr((\hat{m}_x, \hat{m}_y) = (m_x, m_y)) \le \frac{1}{4} + 2^{-n(E - \epsilon_n)} + 4\sigma \tag{49}
\]

where

\[
E = \min_{Q_{XYZ}:\, D(Q_{XY}\|P_X \times P_Y) \le \sigma} D(Q_{Z|XY}\|W_{Z|XY}|Q_{XY}) + |R_x + R_y - I_Q(X,Y;Z)|^+ > 0 \quad \text{for } R_x + R_y > I(X,Y;Z).
\]

Again, we need modified versions of Lemmas 5 and 3 from [6] to prove (49). The proofs are extremely similar to those of Lemmas 7 and 6, so we omit the details here. □